# Unsupervised Learning

| Model | License | Description | Tags | Org | Downloads | Likes |
|---|---|---|---|---|---|---|
| Cityscapes Semantic Eomt Large 1024 | MIT | Demonstrates the potential of the Vision Transformer (ViT) for image segmentation by adapting ViT into an efficient segmentation model. | Image Segmentation | tue-mps | 85 | 0 |
| Chronodepth V1 | MIT | ChronoDepth is a temporally consistent video depth estimation method built on video diffusion priors, learning and predicting depth from video. | 3D Vision | jhshao | 28 | 1 |
| Dust3r ViTLarge BaseDecoder 512 Linear | | DUSt3R is a deep learning model that generates 3D geometry from images, handling a range of geometric 3D vision tasks. | 3D Vision, Safetensors | naver | 313 | 0 |
| Dust3r ViTLarge BaseDecoder 224 Linear | | DUSt3R model for geometric 3D vision from images, able to reconstruct 3D scenes from one or more images. | 3D Vision, Safetensors | naver | 1,829 | 0 |
| Dpt Dinov2 Giant Kitti | Apache-2.0 | Depth-estimation model using the DPT framework with a DINOv2 (giant) backbone. | 3D Vision, Transformers | facebook | 56 | 0 |
| Dpt Dinov2 Large Kitti | Apache-2.0 | Depth-estimation model using the DPT framework with a DINOv2 (large) backbone. | 3D Vision, Transformers | facebook | 26 | 2 |
| Dpt Dinov2 Base Nyu | Apache-2.0 | Depth-estimation model using the DPT framework with a DINOv2 (base) backbone. | 3D Vision, Transformers | facebook | 146 | 0 |
| Umt5 Xxl | Apache-2.0 | UMT5 is a multilingual text-generation model pretrained on the mC4 multilingual corpus, supporting 107 languages, with language coverage balanced via the UniMax sampling strategy. | Large Language Model, Transformers, Multilingual | google | 4,449 | 32 |
| Umt5 Xl | Apache-2.0 | Multilingual text-generation model pretrained on the mC4 multilingual corpus, supporting 107 languages. | Large Language Model, Transformers, Multilingual | google | 1,049 | 17 |
| Umt5 Small | Apache-2.0 | Unified multilingual T5 model pretrained on the mC4 multilingual corpus, covering 107 languages. | Large Language Model, Transformers, Multilingual | google | 17.35k | 23 |
| Tat Model | | Sentence-embedding model based on sentence-transformers that maps sentences and paragraphs into a 768-dimensional vector space, suited to sentence-similarity and semantic-search tasks. | Text Embedding | mathislucka | 22 | 0 |
| Congen TinyBERT L4 | Apache-2.0 | Sentence-embedding model based on ConGen that maps sentences into a 312-dimensional vector space, suited to semantic search. | Text Embedding, Transformers | kornwtp | 13 | 1 |
| Sup SimCSE VietNamese Phobert Base | | SimeCSE_Vietnamese is a Vietnamese sentence-embedding model based on SimCSE, using PhoBERT as the pretrained language model; it works with both unlabeled and labeled data. | Text Embedding, Transformers, Other | VoVanPhuc | 25.51k | 22 |
| Bimeanvae Amzn | BSD-3-Clause | BiMeanVAE is a variational autoencoder (VAE) based model used primarily for text summarization. | Text Generation, Transformers, English | megagonlabs | 85 | 0 |
| Gpt2 Chinese Cluecorpussmall | | Distilled Chinese GPT-2 model pretrained on the CLUECorpusSmall dataset, suited to Chinese text generation. | Large Language Model, Chinese | uer | 41.45k | 207 |
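Several of the entries above (Tat Model, Congen TinyBERT L4, Sup SimCSE VietNamese Phobert Base) are sentence-embedding models that map text into a fixed-dimensional vector space for similarity and semantic search. The retrieval step behind such models reduces to cosine similarity between a query vector and document vectors. A minimal sketch in NumPy, using made-up 4-dimensional placeholder vectors rather than real model outputs:

```python
import numpy as np

def cosine_similarity(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of document vectors."""
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return d @ q  # one score per document row

# Toy 4-dimensional "embeddings" standing in for a real model's
# 768-dimensional outputs (placeholder values, not from any listed model).
docs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # doc 0: close to the query direction
    [0.0, 0.8, 0.2, 0.0],   # doc 1: orthogonal to the query
    [0.1, 0.0, 0.0, 0.9],   # doc 2: weakly related
])
query = np.array([1.0, 0.0, 0.0, 0.1])

scores = cosine_similarity(query, docs)
best = int(np.argmax(scores))  # index of the most similar document
```

In practice the vectors would come from a model's encode step (e.g. `SentenceTransformer.encode` in the sentence-transformers library); the ranking logic is identical.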
© 2025 AIbase